ABSTRACT

Determination of parentage is fundamental to the study of biology and to applications such as the
identification of pedigrees. Limitations to studies of parentage have stemmed from the use of an insufficient
number of hypervariable loci and mismatches of alleles that can be caused by mutation or by laboratory
error and that can generate false exclusions. Furthermore, most studies of parentage have been limited
to comparisons of small numbers of specific parent-progeny triplets, thereby precluding large-scale surveys
of candidates where there may be no prior knowledge of parentage. We present an algorithm that can
determine probability of parentage in circumstances where there is no prior knowledge of pedigree and
that is robust in the face of missing data or mistyped data. We present data from 54 maize hybrids and
586 maize inbreds that were profiled using 195 SSR loci, including simulations of additional levels of
missing and mistyped data, to demonstrate the utility and flexibility of this algorithm.



DETERMINATION of parentage is fundamental to
the study of reproductive and behavioral biology.
The increasing availability of highly discriminant genetic
markers for many diverse species provides the
potential to uniquely characterize individuals at numerous
loci and to unambiguously resolve parentage where
genealogical relationships are unknown, in error, or in
dispute.
Identification of parent-progeny relationships in wild
populations of animals and plants provides insights into
the success of various reproductive strategies (Ellstrand
1984; Smouse and Meagher 1994; Alderson
et al. 1999) and has allowed for the implementation
of management programs to conserve genetic diversity
(Miller 1975; Rannala and Mountain 1997). The
association of pedigree with physical appearance or performance
in domesticated animals and plants allows
parents that have contributed favorable alleles for desirable
traits through selective breeding programs to be
identified (Bowers and Meredith 1997; Sefc et al.
1998; Vankan and Faddy 1999). These applications of
associative genetics facilitate further progress in genetic
improvement through breeding. Establishment of parentage
is also useful to secure legal rights of guardianship
in humans, to help protect intellectual property in
plant varieties, to validate breed pedigrees of domesticated
animals, to protect stocks of fish, and to identify
provenance of meat that is available in supermarkets
(Gotz and Thaller 1993; Primmer et al. 2000; White
et al. 2000).

Most studies of pedigree have utilized exclusion analysis
where the molecular marker genotypes of either one
or a restricted number of potential triplets of offspring
and putative parents are compared. Often the identity
of the mother is not in question; the maternal profile
is subtracted from that of the offspring and the deduced
paternal profile is then compared with candidate father
genotypes (Ellstrand 1984; Hamrick and Schnabel
1985). Individuals who could not have contributed the
paternal genotype are excluded; the remainder are possible
parents. Nonpaternity in humans is generally declared
only on the basis of exclusions exhibited by at
least two unlinked and independent loci. This criterion
of exclusion reduces the likelihood of a false declaration
of nonpaternity on the basis of marker results that are
actually due to mutation within the phylogeny. Bein et
al. (1998) show that evidence of nonpaternity should
require exclusions at loci on different chromosomes to
avoid erroneous conclusions that would be made due
to nondisjunction at meiosis leading to uniparental inheritance.
A requirement for at least three independent
exclusions to declare nonpaternity in humans has also
been instituted (Gunn et al. 1997). In studies of natural
populations of animals or plants where numerous
parent-progeny triplets are examined it is usual to accept
a single exclusionary event as evidence of nonpaternity
(Marshall et al. 1998). Paternity testing has been extended
to situations where DNA from either parent is
unavailable. For example, paternity can still be established
in circumstances where the putative father is deceased
but his parents are still alive [citation illegible in source].
[Authors illegible in source] demonstrate that paternity
can be determined in cases where the mother is
unavailable for testing. [Authors illegible in source]
partially reconstructed the DNA profile of a
missing crocodile parent using profiles of the mother
and progeny.

Chakraborty et al. (1988) and Smouse and Meagher
(1994) report that reliance upon exclusion alone has
usually failed to unambiguously resolve paternity. Limitations
have stemmed from the use of an insufficient
number of independent hypervariable loci. Other statistical
methods are therefore required to calculate the
likelihood of paternity for each nonexcluded male
(Berry and Geisser 1986; Meagher 1986; Meagher
and Thompson 1986; Thompson and Meagher 1987;
Devlin et al. 1988; Berry 1991). Marshall et al. (1998)
draw attention to the quality of data that is encountered
practically in genotypic surveys. Maternal genetic data
may or may not be available, data may be absent for
some candidate males, data may be missing for some
loci in some individuals, null alleles exist, and typing
errors occur. Reconstructing or validating the pedigrees
of varieties of cultivated plants often provides additional
challenges because their phylogenies can reveal apparent
exclusions that masquerade as non-Mendelian inheritance.
For example, apparent exclusions can occur
in circumstances where an individual is used as a parent
prior to completion of the inbreeding process. The development
of parent and progeny then continues on
parallel but separate tracks, thereby allowing the possibility
that alleles that are subsequently lost through inbreeding
in the parent can still become fixed in the
progeny. It is also possible to create many offspring from
a single mating and to use the same parent repeatedly
in "backcrossing." Therefore, many individual inbred
lines, varieties, or hybrids can be highly related. In consequence,
there are numerous (and often very similar)
pedigrees. The effective number of marker loci that can
discriminate between alternate pedigrees is proportionally
reduced as parents are increasingly related. Consequently,
inbred lines can be more similar to one or
more sister or other inbreds than those inbreds are to
one or both of their parents.

It has not been usual to search among hundreds of
individuals to identify the most probable maternal and
paternal candidates for a specific progeny. Most studies
of parentage are in circumstances where there is a priori
information for at least one of the parents (usually the
maternal parent). Limited availability of marker loci and
the lack of very high-throughput genotyping systems
offering inexpensive datapoint costs may have focused
research on studies that involve relatively few individuals
and where there is at least some a priori indication of
parentage. Studies that have been conducted without
a priori information on parentage include species whose
mating or reproductive behavior makes identification of
the parent difficult or impossible. Examples include
studies conducted on birds that practice brood parasitism
(Alderson et al. 1999) or extra-pair copulation
[citation illegible in source] or on species such as the wombat
that are difficult to observe in the wild (Taylor et al.
[year illegible in source]).

Two circumstances favor a revised approach to the
statistical analysis of pedigree. First, molecular marker
technologies are rapidly developing and will allow numerous
loci to be typed for thousands of individuals
rapidly and inexpensively. A greater number and diversity
of larger-scale studies of pedigree can be expected
within the plant and animal kingdoms, including individuals
in which there is no prior knowledge of pedigree.
A larger number of markers means a greater chance
for errors. Therefore, the second circumstance follows:
Procedures that are efficient and robust in the face of
apparent exclusions, missing data, and laboratory error
are required.

The purpose of this article is to describe and evaluate
a methodology that can be used to quantify the probability
of parentage of hybrid genotypes. We focus on parentage
because it is the primary focus of published literature
and it is the easiest level of ancestry to understand.
The method is robust in the face of mutation, pseudo-non-Mendelian
inheritance (apparent exclusions) due
to residual heterozygosity in parental seed sources, missing
data, and laboratory error. The methodology has a
number of advantages: (i) It can accommodate large
datasets of possible ancestors (hundreds of inbreds or
hybrids each profiled by >100 marker loci), (ii) it does
not require prior knowledge about either parent of the
hybrid of interest, (iii) it does not require independence
of the markers, and (iv) it can successfully discriminate
between many highly related and genetically similar genotypes.
We demonstrate the effectiveness of this approach
to identify inbred parents of maize (Zea mays
L.) hybrids using simple sequence repeat (SSR) marker
profiles for 54 maize hybrids together with their parental
and grandparental genotypes included among a total
of 586 inbred lines. The methodology is applicable to
the investigation of parentage for all progeny developed
from parental mating without subsequent generations
of inbreeding.

MATERIALS AND METHODS

Algorithm: Consider an index hybrid whose parentage is
unknown or in dispute. Inbreds in an available database are
possible ancestors of the hybrid. The objective is to find the
probabilities of closest ancestry for each inbred on the basis
of information from SSRs from the index hybrid and the
inbreds. There is no reason to trim the database by removing
inbreds thought to be unrelated to the index hybrid because
their lack of relationship will be discovered.

Consider a pair of possible ancestors: inbred i and inbred
j. There is nothing special about this particular pair; all
pairs will be treated similarly. The problem involves calculating
the probability that inbreds i and j are in the hybrid's ancestry,
and doing this for all pairs in the database.
The crux of the algorithm is Bayes' rule (Berry 1996).
Let P(i, j|SSRs) stand for the (posterior) probability
that inbreds i and j are ancestors of the index hybrid given the
information from the various SSRs. Let P(i, j) stand for the
unconditional, or prior, probability of the same event. Finally,
P(SSRs|i, j) is the probability of observing the various SSR
results if in fact i and j are ancestors. Bayes' rule says

$$P(i, j \mid \mathrm{SSRs}) = \frac{P(\mathrm{SSRs} \mid i, j) \times P(i, j)}{\sum_{u,v} P(\mathrm{SSRs} \mid u, v) \times P(u, v)},$$

where the sum in the denominator is over all pairs of inbreds,
indexed by u and v. P(SSRs|i, j) × P(i, j) is one of the terms
in the denominator. (To compute the denominator in the
above expression, fix a particular order to the inbreds in the
database and take u ≤ v in expressions involving the pair (u,
v). If there are 586 inbreds, for example, then the number
of pairs and the number of terms in the denominator is
586(587)/2 = 171,991.) Inbreds i and j may be parents or
grandparents or other types of relations or bear no relationship
at all to the hybrid. If there are more than two ancestors
in the database, such as both parents and all four grandparents,
then the possible pairs involving these ancestors will
generally have the highest posterior probabilities. If the hybrid's
true parents are in the database, then as a pair they will
typically have the highest overall posterior probability. If both
i and j happen to be related to one particular parent of the
hybrid, then as a pair their posterior probability will be low
because they will not usually account for many of the alleles
that are contributed by the other parent of the hybrid.
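
As a quick check on the arithmetic above, the number of denominator terms can be counted directly. This is only a sketch: it assumes, as the count 586(587)/2 implies, that a pair may consist of the same inbred twice (u ≤ v), and the variable names are illustrative rather than taken from the article.

```python
from itertools import combinations_with_replacement

n_inbreds = 586  # size of the database described in the text

# Unordered pairs (u, v) with u <= v, so that each pair is counted once.
pairs = list(combinations_with_replacement(range(n_inbreds), 2))

n_terms = len(pairs)
print(n_terms)  # 171991, i.e. 586 * 587 / 2
```

Fixing an order on the inbreds and enumerating pairs this way guarantees that no pair enters the denominator twice.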

We will make the "no-prior-information" assumption that
P(u, v) is the same for all pairs (u, v). This implies that this
factor is cancelled from both numerator and denominator in
the above expression, giving:

$$P(i, j \mid \mathrm{SSRs}) = \frac{P(\mathrm{SSRs} \mid i, j)}{\sum_{u,v} P(\mathrm{SSRs} \mid u, v)}.$$

The problem is then to calculate a typical P(SSRs|i, j). Assume
inbreds i and j are both ancestors. We calculate the probability
of observing the resulting hybrid under this assumption. We
make no assumptions about relationships among the various
inbreds. Other possible ancestors will be considered implicitly
in the calculation by allowing their alleles to be introduced
through breedings with i and j. However, the nature of such
breeding is not specified. Suppose inbred i's alleles are (a,
b). Each descendant of inbred i receives one of these two
alleles or not. An immediate descendant receives one with
probability 1 (barring mutations). A second generation descendant
receives one of them with probability 0.5. And so
on. Since degree of ancestry (if any) is unknown, we label the
actual probability of passing on one of these alleles to be P.
Similarly, an allele from inbred j has been passed down to the
hybrid or not, and the probability of the former is P. In the
following, P will be taken to equal 0.50, although we will also
consider P = 0.99 in some of the calculations.

Assuming P = 0.50 is consistent with the closest ancestors
in the database being grandparents. However, we are not
interested in grandparents per se. If the closest ancestors in
the database were parents, then as indicated above P should
equal 1 (ignoring mutations and laboratory errors). Our primary
concern is when the parents are not in the database. In
this case P is no greater than 0.50. Assuming P = 0.50 is robust
over the middle range of possible values of P. One way in
which it is robust is that there may be mutations and laboratory
errors, in which case P would have to be <1. Taking P to
equal 0.50 levies little penalty against a particular pair in which
there is an apparent exclusion from direct parentage. Therefore,
taking P to be <1 means that if the true parents are in
the database then they will not be ruled out if there happen
to be mutations and laboratory errors. And if the closest
ancestors in the database are more distant than grandparents,
they are apt to be identified because [remainder of sentence
illegible in source].

When i and j are ancestors there are four possibilities: (1)
The alleles of both inbreds i and j were passed to the hybrid,
(2) inbred i came through but not inbred j, (3) inbred j
came through but not inbred i, and (4) neither inbred came
through. Assuming independence, these have respective probabilities
P², P(1 − P), P(1 − P), (1 − P)². In the case P =
0.50, all of these probabilities equal 0.25.
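
The four case probabilities described above (both alleles passed, only inbred i's, only inbred j's, neither) can be written out as a small function. This is a minimal sketch under the independence assumption stated in the text; the function name `case_weights` is hypothetical and not from the article.

```python
def case_weights(P):
    """Probabilities of the four cases for pass-through probability P.

    Order: (1) i and j, (2) i but not j, (3) j but not i, (4) neither.
    """
    return [P * P, P * (1 - P), (1 - P) * P, (1 - P) * (1 - P)]


weights_half = case_weights(0.5)   # every case has probability 0.25
weights_high = case_weights(0.99)  # case 1 dominates when P is near 1
print(weights_half)
```

Whatever the value of P, the four weights sum to 1, which is what allows them to be used as weights in a total-probability calculation.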

An instance of the law of total probability (Sect. 5.3, Berry
1996) is that the probability of observing a hybrid's alleles is
the average of the conditional probability of this event given
the above four cases. The simplest of the four cases is the
first possibility: Assuming the hybrid's alleles are passed down
directly from both inbreds, the probability of observing the
hybrid's genotype is either 1 or 0 depending on whether the
hybrid shares both inbreds' alleles. (It is especially easy when
both inbreds are homozygous.) The other three cases require
an assumption regarding the possibility that an inbred's allele
is not passed to the hybrid but is interrupted by a mutation,
a laboratory error, or intervening breeding. We regard such
an allele as being selected from all known alleles with probability
1/(number of alleles), where the number of alleles is the
total number of alleles known to exist at the locus in question.
An alternative approach would be to use the allelic proportions
that are present in the database (or in another database).
However, the lines in the database may not be randomly selected
from any population. For example, a line that has been
highly used in breeding would have many derivative lines in
the database, in which case the frequencies of its alleles will
be artificially inflated. Assuming equal probabilities for the
various alleles at a given locus is robust in the sense that it is
not affected by adding and dropping lines from the database.

There are many cases to consider when computing the
probability of observing a hybrid's alleles, depending on the
zygosity of the hybrid and the inbreds, and allowing for the
possibility of missing alleles or "extra alleles" in the assessment
of the hybrid and inbred genotypes. These possibilities are
too numerous to list. Instead we give three simple examples.
All the examples have homozygous inbreds, the most common
case. And each of the three hybrids has two alleles, again the
most common case. We suppose that the measured alleles for
three SSRs and a particular trio of hybrid and ancestor inbreds
are as we have indicated in Table 1.

For SSR 1 there are three known alleles, one in addition
to alleles a and b that are listed for the three lines (hybrid,
inbred i, and inbred j) in Table 1. For SSR 2 and SSR 3
there are two known alleles in addition to those listed. The
calculations in the right half of Table 1 will now be explained.
Implicit in calculating P(SSR|i, j) is the assumption, required
in both the numerator and denominator of Bayes' rule, that
inbreds i and j are ancestors of the hybrid. Consider SSR 1.
In case 1 above, both ancestors' alleles (as measured by the
laboratory process) are assumed to pass to the index hybrid,
and so in this case the hybrid is necessarily ab. The probability
of observing the actual hybrid's genotype is 1 for case 1, as
shown in Table 1. In case 2, we assume that inbred i's allele
passes to the hybrid but inbred j's does not. Indeed, the hybrid
has an allele a. The probability of observing b as the other
allele is 1/(number of alleles) = 1/3, as shown in Table 1.
Case 3 is similar. In case 4, neither ancestor allele is passed
to the hybrid; the probability of observing the hybrid's genotype
(or any heterozygous genotype) is 2(1/3)(1/3) = 2/9.
Since P = 0.50, the overall (unconditional) probability in the
right-hand column of Table 1 is the simple average of the four
cases, as indicated in Table 1.

For SSR 2 and SSR 3 the calculations are similar. For SSR
2 there is some evidence for pair (i, j) being ancestors
but it is not conclusive. For SSR 3 there is even less evidence
favoring pair (i, j). It would not take many SSRs with evidence
similar to that for SSR 3 to essentially rule out this pair,
provided that other pairs are not similarly inconsistent.

TABLE 1

Probability of observing a hybrid's alleles using three sample SSRs and four possible combinations (cases)
of alleles passed, assuming that inbreds i and j are ancestors of the hybrid

                                           Probability of observing the hybrid's genotype
                                           ----------------------------------------------
     No. of                                Case 1   Case 2     Case 3     Case 4          Overall probability
SSR  alleles  Hybrid  Inbred i  Inbred j   i, j     i, not j   not i, j   not i, not j    P(SSR|i, j)

1    3        ab      aa        bb         1        1/3        1/3        2/9             17/36
2    5        bd      bb        cc         0        1/5        0          2/25            7/100
3    6        ab      cc        dd         0        0          0          2/36            2/144

SSR, simple sequence repeat marker profile.
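
The three worked examples can be reproduced with exact fractions. The sketch below covers only the simple situation used in Table 1 (both inbreds homozygous, hybrid scored with two alleles, replacement alleles drawn uniformly from the known alleles); the function name and argument layout are hypothetical, and the many other cases the text mentions (missing or extra alleles, heterozygous inbreds) are not handled.

```python
from fractions import Fraction


def locus_likelihood(hybrid, allele_i, allele_j, n_alleles, P=Fraction(1, 2)):
    """P(SSR | i, j) for one locus with homozygous inbreds i and j.

    hybrid    -- the hybrid's two alleles, e.g. ("a", "b")
    allele_i  -- the single allele carried by homozygous inbred i
    allele_j  -- the single allele carried by homozygous inbred j
    n_alleles -- number of alleles known at this locus
    """
    h = tuple(sorted(hybrid))
    u = Fraction(1, n_alleles)  # chance of any one allele when drawn at random

    # Case 1: both inbred alleles pass, so the hybrid must be exactly (i, j).
    case1 = Fraction(int(h == tuple(sorted((allele_i, allele_j)))))
    # Case 2: i's allele passes; the other hybrid allele is a random draw.
    case2 = u if allele_i in h else Fraction(0)
    # Case 3: j's allele passes; the other hybrid allele is a random draw.
    case3 = u if allele_j in h else Fraction(0)
    # Case 4: both hybrid alleles are random draws (two orders if heterozygous).
    case4 = (2 if h[0] != h[1] else 1) * u * u

    return (P * P * case1 + P * (1 - P) * case2
            + (1 - P) * P * case3 + (1 - P) ** 2 * case4)


ssr1 = locus_likelihood(("a", "b"), "a", "b", 3)
ssr2 = locus_likelihood(("b", "d"), "b", "c", 5)
ssr3 = locus_likelihood(("a", "b"), "c", "d", 6)
print(ssr1, ssr2, ssr3)
```

With P = 1/2 the three values agree with the right-hand column of Table 1 (note that `Fraction` prints 2/144 in lowest terms).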

To find the overall P(SSRs|i, j), multiply the individual
P(SSR|i, j) over the various SSRs. There are purely computational
issues to address. Each P(SSR|i, j) is a number between
0 and 1. When there are a great many SSRs, the product of
these numbers will be vanishingly small. To lessen problems
with computational underflow, for each SSR we multiply
P(SSR|u, v) by the same constant for each pair (u, v): the
inverse of the largest possible such probability. For example,
since 17/36 is the largest probability for a heterozygous hybrid
at an SSR having three alleles (as is the case for SSR 1 in
Table 1), we multiply all factors P(SSR|u, v) by 36/17. To
eliminate remaining problems with underflow, we do calculations
using logarithms (adding instead of multiplying) and
take antilogs at the end.
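
The underflow problem and a log-based remedy can be illustrated as follows. This is a sketch, not the authors' code: instead of rescaling each SSR factor by a locus-specific constant, it subtracts the largest log-likelihood over all pairs before taking antilogs, which is a common variant of the same idea. The pair labels and the 200-locus data are invented for illustration, and the function assumes at least one pair has nonzero likelihood.

```python
import math


def posterior_over_pairs(per_locus):
    """Normalize products of per-locus likelihoods into pair posteriors.

    per_locus maps a pair label to its list of P(SSR | u, v) values.
    A flat prior over pairs is assumed, as in the text.
    """
    log_like = {}
    for pair, factors in per_locus.items():
        # Add logarithms instead of multiplying probabilities.
        log_like[pair] = sum(
            math.log(f) if f > 0 else float("-inf") for f in factors
        )
    largest = max(log_like.values())
    # Subtracting the same constant from every pair leaves the ratios intact.
    weights = {pair: math.exp(ll - largest) for pair, ll in log_like.items()}
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


# Hypothetical data: 200 loci, each pair behaving like one row of Table 1.
per_locus = {
    ("A", "B"): [17 / 36] * 200,
    ("A", "C"): [7 / 100] * 200,
    ("B", "C"): [1 / 72] * 200,
}
naive_product = (1 / 72) ** 200  # underflows in double precision
posterior = posterior_over_pairs(per_locus)
print(naive_product, posterior[("A", "B")])
```

Because the same constant is removed from every pair, the normalized posteriors are unchanged while the intermediate numbers stay within floating-point range.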

The probability P(SSRs|u, v) is calculated for all (u, v) pairs
and summed over all possible pairings in the database, including
that for the inbred pair under consideration: (i, j). This
gives the denominator in the expression for P(i, j|SSRs).

To determine the probability that any particular inbred, say
inbred i, is the closest ancestor of the index hybrid, sum
P(i, v|SSRs) over all inbreds v with v ≠ i. Call this P(i|SSRs).
The maximum of P(i|SSRs) for any inbred i is 1. But since
there is one closest ancestor on each side of the family, the
sum of P(i|SSRs) over all inbreds i is 2. If there is a particular
pair (i, j) for which P(i, j|SSRs) is close to 1 then both P(i|SSRs)
and P(j|SSRs) separately will be close to 1.
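
The step from pair posteriors to per-inbred probabilities can be sketched as below. The function name and the three-inbred example are hypothetical; the posteriors are made-up numbers chosen only so that they sum to 1 over pairs of distinct inbreds.

```python
from collections import defaultdict


def closest_ancestor_probabilities(pair_posterior):
    """Sum P(u, v | SSRs) over partners to obtain P(i | SSRs) for each inbred."""
    marginal = defaultdict(float)
    for (u, v), p in pair_posterior.items():
        if u != v:  # the text sums over partners v different from i
            marginal[u] += p
            marginal[v] += p
    return dict(marginal)


# Hypothetical pair posteriors for a three-inbred database.
pair_posterior = {("A", "B"): 0.90, ("A", "C"): 0.08, ("B", "C"): 0.02}
marginals = closest_ancestor_probabilities(pair_posterior)
print(marginals)
```

Each pair contributes its probability to both of its members, which is why the per-inbred probabilities sum to 2 rather than 1 when all the posterior mass lies on pairs of distinct inbreds.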

SSR data: DNA was extracted from 54 maize hybrids and
from 586 maize inbreds. All of the hybrids and most inbreds
are proprietary products of Pioneer Hi-Bred International;
some important publicly bred inbred lines were also included.
The inbred parents and grandparents of each hybrid were
included within the set of inbreds. Other inbreds that were
genotyped include many that are highly related by pedigree
to parents and grandparents of the hybrids. The hybrids were
chosen because each has a pedigree that is known to us and
collectively they represent a broad array of diversity of maize
germplasm that is currently grown in the United States ranging
from early to late maturity.

A total of 195 SSR loci were used in this study following
procedures described in Smith et al. (1997), but modified as
described below. SSR loci were chosen on the basis that they
individually have been shown to have a high power of discrimination
among maize inbred lines and that collectively they provide
for a sampling of diversity for each chromosome arm. Of these
SSR loci, the chromosome number (in parentheses, the number of loci
on that chromosome) is as follows: [per-chromosome counts
illegible in source]. 17 SSR loci have not yet been mapped. The
correlations among the loci are unknown and are irrelevant for
our methodology.

Sequence data for primers that allow many of these (and
other) SSR loci to be assayed are available at website
http://www.agron.missouri.edu. All primers were designed to anneal
and amplify under a single set of conditions for PCR in 10-µl
reactions. Genomic DNA ([amount illegible in source] ng) was
amplified in 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-Cl (pH 8.3) using
0.3 units AmpliTaq Gold DNA polymerase (PE Corporation),
oligonucleotide primer pairs (one primer of each pair was
fluorescently labeled) at 0.17 µM, and dNTPs ([concentration
illegible in source] mM). This mixture
was incubated at 95° for 10 min (hot start); amplified using
45 cycles of denaturation at [temperature missing in source] for
50 sec, annealing at 50° for 50 sec, extension at 72° for 35 sec;
and then terminated at 72° for 10 min. A water bath thermocycler
manufactured at Pioneer Hi-Bred International was used for PCR
reactions. PCR products were prepared for electrophoresis by diluting
3 µl of each product to a total of 57 µl using a combination
of PCR products generated from other loci for that same
maize genotype (multiplexing) and/or dH2O. Dilution of 1.5
µl of this mixture to 5 µl with gel loading dye was performed;
it was then electrophoresed at 1700 V for 1.5 hr on an ABI
model 377 automated DNA sequencer equipped with GENESCAN
software v. 3.0 (PE-Applied Biosystems, Foster City, CA).

PCR products were sized automatically using the "local
Southern" sizing algorithm (Elder and Southern 1987).
After sizing of PCR products using GeneScan, alleles were
assigned using Genotyper software (PE-Applied Biosystems).
Generally, allele assignations for each locus were made on
the basis of histogram plots consisting of 0.5-bp bins. Breaks
between the histogram plots of >1 bp were generally considered
to constitute separation between allele bins; however,
other criteria, such as the presence of the nontemplate-directed
addition of adenine (-A addition) and naturally
occurring 1-bp alleles, were used on a marker-by-marker basis
to define the allele dictionary. All allele scores were made
without knowing the identities of the maize genotypes.



RESULTS 

Table 2 presents the probability of closest ancestry of
the top five ranking inbred lines for each of 5 hybrids
assuming P = 0.50 (Table 2A) and P = 0.99 (Table 2B).
Probabilities of ancestry are shown for all 54 hybrids and the
top ranking inbreds in Figure 1 for P = 0.50 (Figure 1a)
and P = 0.99 (Figure 1b). Results for the hybrids presented
in Table 2 are located at the top of Figure 1.



Received from < 515 3Ji 6883 > at 2/28/03 12:05:06 PM [Eastern Standard Time] 



02-'26/03 FRI 12:29 FAX 515 334 6883 



PIONEER HI -BRED DSM 



g}085 



TABLE 2

Probability of ancestry of five hybrids using data obtained from 30, 100, and 195 SSR loci

              30 loci                     100 loci                    195 loci
        ------------------------    ------------------------    ------------------------
Hyb.    Inb.     Prob.    SE        Inb.     Prob.    SE        Inb.     Prob.    SE

A. Assuming P = 0.50

3417    SP1      0.9607   0.0123    P1       0.?419   0.0232    P1       1.00??   E-07
        P2       0.5077   0.1963    ?        0.8111   0.2235    P2       0.9957   0.0??3
        D2P2     0.101?   0.103?    D1P2     0.??59   0.2235    D1P2     0.0018   0.003?
        D1P2     0.0907   0.0927    SP1      0.12?3   0.025     D2P2     E-06     ?
        P1       0.032    0.0125    D1P1     0.0009   0.0002    SP1      E-06     E-?7

3525    P1       0.3545   E-07      P1       0.9999   <E-20     P1       1.0000   <E-20
        P2       0.31?3   E-07      P2       0.5437   <E-20     P2       0.96?5   0.052?
        D1P?     0.1699   E-07      D1P2     0.4563   <E-20     D1P2     0.0365   0.0523
        GP1      0.1441   E-07      GP1      E-07     E-1?      SP1      E-15     <E-20
        GP2      0.0110   E-0?      SP?      E-07     <E-20     GP2      E-16     <E-20

3556    P1       1.0000   E-06      P1       0.9999   E-10      P1       1.0000   <E-20
        P2       0.9616   E-0?      P2       0.9997   E-10      P2       1.0000   <E-?0
        D1P2     0.0340   E-10      D1P2     0.0003   E-14      ?        ?        ?
        GP2      0.0043   E-09      D2P?     E-05     E-15      ?        ?        ?
        D2P2     0.0002   E-10      D?P2     ?        E-1?      ?        ?        ?

3905    D1P?     0.9622   E-0?      D1P1     0.9305   0.0058    P1       1.0000   E-0?
        SP2      0.4927   E-07      SP2      0.6280   0.0976    D1P?     1.0000   E-0?
        D2P2     0.2536   E-07      D1P2     0.2321   ?         ?        ?        E-06
        D1P?     0.1622   E-07      D2P2     0.131?   ?         ?        ?        ?
        P2       0.0565   E-07      P1       0.0197   ?         ?        E-1?     ?

3940    P2       0.999?   0.0001    P2       0.9999   E-05      P2       1.0000   E-09
        D1P?     0.9203   0.0009    P1       0.9970   0.0011    P1       1.0000   E-09
        P1       0.06?8   E-05      D1P2     0.0030   0.0011    ?        ?        ?
        D1P1     0.0127   E-05      D2P2     0.0001   ?         ?        ?        ?
        DP1P2    0.0014   0.0009    ?        0.0001   ?         ?        ?        ?

B. Assuming P = 0.99

3417    SP1      0.9995   0.0001    P1       0.9999   E-05      P1       0.9999   E-0?
        P2       0.3836   0.1658    P?       0.9938   0.0107    P2       0.9999   E-?8
        D1P2     0.0722   0.1029    D1P?     0.0061   0.0107    D1P2     E-11     E-11
        D2P?     0.0441   0.0623    D1P?     E-05     E-06      D2P2     E-??     E-14
        P1       0.0001   0.000?    SP1      E-0?     0         SP1      E-20     E-21

3525    P1       0.9999   0         P1       0.9999   0         P1       1.0000   0
        P2       0.3991   0         D1P2     0.9749   0         P2       0.6135   0.444?
        D1P2     0.1008   E-11      P2       0.025    0         D1P2     0.3364   0.4446
        GP1      E-05     0         D2P2     E-20     0         GP?      E-4?     0
        GP2      E-06     E-17      SP1      E-24     0         D2P2     E-??     0

3556    P1       1.0000   0         P1       1.0000   0         P1       0.9999   0
        P2       0.9996   0         ?        0.9999   0         P2       0.9999   0
        D1P2     0.0003   0         D1P2     E-?9     0         D1P2     ?        0
        D1P1     E-11     0         D?P1     E-21     0         D2P1     E-49     0
        D2P1     E-13     0         D2P1     E-21     0         D?P1     E-54     0

3905    D1P1     0.9999   0         D1P1     0.9999   E-08      P1       1.0000   E-09
        P2       0.9992   ?         P2       0.9999   E-06      P?       0.9947   E-09
        SP2      0.0006   0         D1P2     E-06     E-06      D1P2     0.0052   E-11
        D1P2     E-05     0         SP1      E-07     E-13      D2P2     E-1?     ?
        D2P2     E-06     0         D2P2     ?        E-10      D1P1     E-25     E-25

3940    P2       0.9990   E-0?      P2       1.0000   E-0?      P1       1.0000   E-09
        D1P2     0.9999   E-0?      P1       0.9999   E-05      P2       1.0000   E-09
        P1       E-06     E-13      D1P2     ?        E-05      D1P2     E-04     E-24
        D1P1     E-0?     E-13      D2P2     E-12     E-11      DP1P2    E-14     E-14
        DP1P2    E-12     E-12      DP1P2    E-21     E-21      D2P2     E-??     E-19

Hyb., hybrid; Inb., inbred; Prob., probability; SE, standard error; P1, parent one; P2, parent two;
SP1, SP2, sibling of parent one or two; D1P1, D1P2, etc., derivatives of parent one or parent two;
DP1P2, derivative of both parent one and parent two. The remainder of this footnote is illegible in
the source. ?, character or cell not legible in the source.






[Figure 1. Plot not recoverable from the source. Rows: the 54 hybrids. Horizontal axis: probability of
ancestry (log10), with ticks at intervals of 5 down to -30. Legible legend entries: Others, Parent.
Legible caption fragments: (a) all 54 hybrids and top ranking inbreds; (b) assuming P = 0.99, all 54
hybrids and top ranking inbreds.]






[Figure 1. Continued. Rows: hybrids; horizontal axis: probability of ancestry (log10).]



When the algorithm used P = 0.50, the two correct
parents were identified as highest in probability for 48
(89%) of the hybrids (Figure 1). For each of the remaining
6 hybrids (among them 3914 and X0915A), only one parent
was ranked in the top two places. The other parent was
outranked by a sister inbred or by an inbred that was a
direct progeny of that parent. Overall, [illegible] (94%)
of parental inbreds were correctly identified. For
hybrids where both parents ranked first and second, the
probabilities for parental lines that ranked first among
all other inbreds ranged from [illegible] to [illegible];
parental lines that ranked second ranged from 1.0 to
[illegible]. For 35 hybrids, both parents had
probabilities of ancestry in excess of [illegible].
Probabilities of ancestry for nonparents that ranked in
first or second place were from [illegible] to 0.7054. For
the majority of hybrids, the probability of the third and
highest ranked nonparental inbred was at or below
[illegible]. This indicates that there is usually very
little uncertainty about closest ancestors.

When the algorithm used P = 0.99 to examine each
of the 54 hybrids, both parents were correctly identified
for 52 (96%) of hybrids and for 98% (102/104) of the
parents across all hybrids (Figure 1). Two hybrids (3914
and X0915A), in which one parent was not ranked in
the top two, were also in the subset not ranked in the
top two assuming P = 0.50 (above). In both cases their
ranks improved (both to third rank) and the actual
parent was supplanted by an inbred that was a direct
progeny of the corresponding parental line. For 49
hybrids, both parents had probabilities of ancestry in
excess of 0.999. Among the 5 hybrids having a parent
ranking second with a probability of ancestry below
0.999, the lowest of these probabilities was
0.897[illegible] and the highest probability for a third
ranking nonparent was 0.1023. For most hybrids the
probability for the third and highest ranked nonparental
inbred was at or below E-10.

Table 2 also addresses data analysis in circumstances
where heterozygous loci occur in inbred lines or where
a hybrid is scored for the presence of more than two
alleles per locus. The presence of more than a single
allele per locus in inbred lines is an infrequent
occurrence in well-maintained inbred development and seed
increase programs but is possible because ~3-5% of
loci can still be segregating and unintended pollination
from genotypes not designated as parents of the hybrid
can occur. For hybrids, more than two alleles per locus
can be scored when DNA is extracted from a bulk of
individual plants and because inbred parents are not
homozygous, due either to residual heterozygosity or to
contamination, or because one or more direct parents
of the hybrid are themselves hybrids. The presence of
more than one allele per locus in an inbred line and
more than two alleles per locus in a hybrid therefore
can be accommodated by multiple runs of the algorithm,
each with a random choice of two alleles per
locus. Consequently, standard errors in the case of
analyzing data from 195 loci tend to be very small because
there were few loci where an inbred or hybrid sample
(from a bulk of individual plants) was scored for more
than two alleles.
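The resampling step described above can be sketched as follows. This is a minimal illustration, not the authors' program: the data layout (a dictionary mapping each locus to the set of alleles scored), the function names, and the `score_fn` callback standing in for one run of the ancestry algorithm are all assumptions made for the example. The default of 20 runs follows the number of runs reported for the robustness tests.

```python
import random
import statistics

def sample_genotype(scored, n_alleles=2, rng=random):
    """Keep at most n_alleles alleles per locus, chosen at random.

    scored: {locus: set of alleles scored at that locus} (assumed layout).
    """
    sampled = {}
    for locus, alleles in scored.items():
        alleles = sorted(alleles)
        if len(alleles) > n_alleles:
            # More alleles were scored than a single genotype can carry.
            alleles = rng.sample(alleles, n_alleles)
        sampled[locus] = tuple(sorted(alleles))
    return sampled

def mean_and_se(values):
    """Mean and standard error of the mean over replicate runs."""
    mean = statistics.fmean(values)
    if len(values) < 2:
        return mean, 0.0
    return mean, statistics.stdev(values) / len(values) ** 0.5

def replicate(scored, score_fn, runs=20, seed=0):
    """Run score_fn (hypothetical: one run of the algorithm) on resampled data."""
    rng = random.Random(seed)
    results = [score_fn(sample_genotype(scored, rng=rng)) for _ in range(runs)]
    return mean_and_se(results)
```

When few loci carry extra alleles, the resampled genotypes differ little between runs, which is consistent with the small standard errors noted above.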

Marshall et al. (1998) have drawn attention to errors
that can be encountered in genotyping surveys. These
errors include missing data, null alleles, and typing
errors. We therefore examined the robustness of the
algorithm by examining the effects of modifications in
the data for five hybrids: 3417, 3525, 3556, 3905, and
3940. First, we reduced the number of SSR loci from
the full set of 195 to 100 and then to 50 (Table 2). Use
of 50 loci generated incorrect ranking of one parent
for each of two hybrids (3417 and 3940) and for both
parents of one hybrid (3905). All of these most highly
ranked nonparental inbreds were closely related to the
true parents for each of the respective hybrids; six
different inbred lines were involved. Four were direct
progeny of the true parents (one with additional
backcrosses from the true parent) and two were full
sisters (from a cross of highly related inbreds) of the
actual parent of the hybrid. Using 100 loci resulted in
correct parental rankings for all hybrids except for 3905,
where neither parent ranked in first or second place.
Four inbreds outranked the true parents of 3905. All four
nonparents were closely related to the respective true
parents; three were direct progeny of the true parent of
the hybrid (one with additional backcrossing to that
parent) and one was a full sister of the true parent. Use
of data from all 195 loci corrected the placement for one
of the parents of hybrid 3905. Two inbreds that were not
parents of this hybrid remained ranked more highly than
one of the true parents. Both were direct progeny of
that parent, and one of these inbreds had additional
backcrossing to that parent in its pedigree.

To address the consequences of laboratory and other
sources of error, we artificially compromised data
quality beyond the level originally provided by eliminating
specific proportions of alleles that had been scored
(establishing scenarios where various numbers of SSR
alleles were not scored) and by misscoring other alleles
(establishing scenarios where various numbers of SSR
alleles were scored incorrectly). We also combined the
scenarios of missing data and wrongly scored data. Table
3 contains a summary of the results of making these
modifications in the data. For all modifications we used
data from all SSR loci and we also randomly chose SSR
loci to create subsets of 50 and 100 loci. In each case,
the program was run 20 times for each hybrid/set of
loci. When all 195 loci were examined, replications
differed only according to the particular choice of alleles
for loci where more than two alleles had been scored.

To evaluate robustness in the face of missing data or
mistyped data, we simulated individual and combined
categories of these data in the hybrid and all inbred
lines at levels of 2, 5, 10, and 25% of the alleles for each
of five hybrids and all inbreds beyond the level of error
as originally scored by the laboratory. We examined the
effects of these levels and types of error for three sizes
of database: 50 loci, 100 loci, and all 195 scored loci.
The same five hybrids considered in Table 2 were
investigated: 3417, 3525, 3556, 3905, and 3940. One of these
hybrids (3905) was chosen because one of its parents
did not rank among the top two places even when the
complete and unmodified data from all SSR loci were
used.
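The simulation of additional missing and misscored alleles can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: the nested-dictionary data layout, the `degrade` function, and the `MISSING` marker are hypothetical, and each allele is altered independently with the given probability, whereas the text describes altering specific proportions of alleles.

```python
import random

MISSING = None  # marker for an allele treated as not scored (assumed convention)

def degrade(genotypes, alleles_by_locus, p_missing=0.0, p_misscored=0.0, seed=0):
    """Return a copy of genotypes with alleles randomly dropped or replaced.

    genotypes: {line: {locus: [allele, ...]}} (assumed layout)
    alleles_by_locus: {locus: [every allele known at that locus]}
    """
    rng = random.Random(seed)
    degraded = {}
    for line, loci in genotypes.items():
        degraded[line] = {}
        for locus, alleles in loci.items():
            new_alleles = []
            for allele in alleles:
                r = rng.random()
                if r < p_missing:
                    # Simulate an allele that was not scored.
                    new_alleles.append(MISSING)
                elif r < p_missing + p_misscored:
                    # Simulate a misscore: substitute a different known allele.
                    others = [a for a in alleles_by_locus[locus] if a != allele]
                    new_alleles.append(rng.choice(others) if others else allele)
                else:
                    new_alleles.append(allele)
            degraded[line][locus] = new_alleles
    return degraded
```

For example, `degrade(genotypes, alleles_by_locus, p_missing=0.10, p_misscored=0.10)` would correspond roughly to a combined scenario at the 10% level; the original data are left unchanged so that several levels can be simulated from the same starting point.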

Examples of robustness in the face of additional error
for the five hybrids using sets of 50 and 100 loci and
195 loci are shown in Table 3, where numbers of parents
ranking in the top two places are presented. Degradation
in the preferential ranking of parent inbreds at a
level of 25% additional missing data was shown for one
hybrid (3525) with usage of 50, 100, or all SSR loci.
Degradation in the preferential ranking of parent
inbreds at a level of 25% additional misscored data was
shown for hybrid 3556. When both additional levels of
missing and misscored data were simulated, degradation
in the ability to preferentially rank inbred parents
occurred for all hybrids and for all sets of SSR (50, 100,
and 195 loci) except for hybrid 3417 when data from
195 SSR loci were used. Over all five hybrids, use of 100
loci improved robustness from the use of 50 loci; use
of 195 loci further improved robustness for four hybrids
(3417, 3525, 3905, and 3940). The degree of improvement
was small, except for hybrid 3905.

We also ranked inbreds according to their probability
of ancestry of hybrids when both parents and all inbred
derivatives and full-sister inbreds of the respective
inbred parents for each hybrid were excluded from the
analysis. The results are too voluminous to present here
but can be summarized as follows: Using P = 0.50, a
grandparent of each respective hybrid ranked into first
place for 41 (76%) hybrids; probabilities ranged from
0.4976 to 1.0 and most were above 0.9999. Other classes
of inbreds that ranked in first position for probability
of ancestry were inbreds derived directly by pedigree
from a grandparent of the respective hybrid (DGP) for
13% of hybrids, inbreds derived directly by pedigree
from a great-grandparent of the respective hybrid
(DGGP) for 9% of hybrids, and one class (2% of
hybrids) with an inbred ranked into first place that was
directly related by pedigree to the
great-great-grandparent of that hybrid. Inbreds that ranked
in second position were related to the respective parents
of the hybrid as follows: Thirty-one (57% of hybrids) were
a grandparent of the respective hybrid, 11 (20%) were
classed as DGP, 7 (13%) were DGGP, 1 (2%) was class
DGGGP, and 4 (7%) were a great-grandparent (GGP) of the
respective hybrid. Over all hybrids, two of the four
grandparents ranked into first and second positions for
23 (43% of hybrids); three grandparents ranked into
the first three positions for 5 (9% of hybrids). There
were no instances where all four grandparents ranked
into the first four positions. Thirty hybrids had a
grandparent ranked into first position using P = 0.99. The
number of grandparents ranked into the top five
positions was 95 (compared to 108 when P = 0.50). The
number of grandparents ranking into the top two
positions was 55 (compared to 71 when P = 0.50). The
mean probability of a grandparent that ranked into the
first two positions was 0.9283 (SD = 0.1431) when P =
0.50 and 0.9980 (SD = 0.0104) when P = 0.99.



DISCUSSION

The prevalent use of paternity indices demonstrates
that it is advantageous to have explicit probabilities of
ancestry to distinguish among different pedigrees.
Molecular marker profiles are rapidly becoming more
extensive and cost effective to generate. Features that would
advance the statistical analysis of molecular marker data
to provide explicit probabilities of ancestry include the
ability to calculate probabilities of ancestry where there
is no a priori information as to the identity of one (usually
the maternal) parent and robustness in the face of
laboratory error.

Maize inbred lines and hybrids provide a very exacting
set of materials for evaluating the discriminatory abilities
of molecular data and statistical procedures that are
employed to interpret those data. Hundreds of maize
inbred lines of known pedigree together encompass a
great diversity and complexity of pedigree relationships.
Some inbred lines can be very highly related and
genetically similar due to their derivation from common
parentage, including from parents that are themselves highly
related. Consequently, relationship categories such as
"sister" or "parent" when applied to maize inbreds
usually refer to closer degrees of pedigree relationship and,
thus, of germplasm and molecular marker profile
similarity than those of the equivalently named classes of
relationship for animal species. Most maize hybrids that
are widely used in the United States today are
constructed from pairs of inbred lines that are unrelated
by pedigree, each inbred parent having been bred from
a separate "pool" of germplasm. Various degrees of
relatedness are possible between hybrids according to the
pedigree relationships among their constituent inbred
parents.

Using P = 0.99 in the algorithm is more specific for
identifying parents than using P = 0.50. However, P =
0.99 is less robust for identifying other relatives, such
as grandparents. When the algorithm was run at P =
0.50 there were 6 hybrids for which one parent did not
rank among the top two most probable genotypes. For
the remaining 48 hybrids the correct parents were
identified even in circumstances where other candidate
inbreds included not only full-sister lines bred from
related parents but also inbreds even more closely related
to the true parent by virtue of being backcross
conversions of the inbred parent of the hybrid. For each of
the 6 hybrids where a nonparent ranked above a true
parent, that higher ranked inbred was always either a
sister or progeny of the outranked true parent. The
range of pedigree relationships expressed by the
Malécot coefficient of relatedness (Malécot 1948) that
was encompassed by pairs of true parents and more
highly ranked inbred relatives of the true parents was
from [illegible] to [illegible]. A coefficient of
[illegible] approximates the relationship between inbreds
A and A' where inbred A' has been bred from a cross of
inbreds A and B [illegible] of the parental inbred A. A
Malécot coefficient of relationship of [illegible] closely
approximates the relationship between inbreds A and A'
where four additional backcrosses of parental inbred A
follow the initial cross of inbreds A and B.
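As a rough guide to the magnitude of such coefficients, the expected share of the recurrent parent in a backcross-derived line can be computed directly. The sketch below rests on assumptions that are not stated in the text: inbreds A and B are unrelated and fully homozygous, there is no selection, and the coefficient of relationship between A and the derived inbred A' is taken to equal the expected fraction of the genome of A' that descends from A. The function name is illustrative only.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected fraction of the genome of A' that descends from inbred A.

    Assumes an initial cross A x B followed by n_backcrosses backcrosses
    to A, with A and B unrelated and no selection (simplifying assumptions).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    # The initial cross contributes one half from A; each backcross
    # halves the share that remains from B.
    return 1.0 - 0.5 ** (n_backcrosses + 1)

for n in range(5):
    print(n, expected_recurrent_fraction(n))
```

Under these assumptions a plain cross of A and B gives 0.5, one backcross gives 0.75, and four backcrosses give 0.96875.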

Running the algorithm at P = 0.99 in comparison to
P = 0.50 raises the probability of ancestry for the parents
while diminishing the probabilities for the third and
lower ranking candidate inbred lines. Use of the
algorithm at P = 0.99 increased both the percentage of
hybrids with both parents ranked in the first two
positions (from 89 to 96%) and the percentage of parental
inbreds that were ranked first and second (from 94 to
98%). Two hybrids (3914 and X0915A) did not have
both parents ranked first and second when the
algorithm was run at P = 0.99. For both of these hybrids
the nonparental inbred that outranked the true parent
was itself a product by pedigree from the true parent
that had been created by an additional four backcrosses
of that parent; the Malécot coefficient of relationship
between the parent of the hybrid and the inbred that
outranked that parent for these two hybrids was 0.9636.

Robustness was tested by evaluating the effects of
using data from different numbers of loci and by
simulating additional levels of missing and misscored data up
to combined levels of 25% error beyond that which was
provided by the laboratory. From our experience, error
rates of 5 to 10% can occur in SSR profiling of maize
due chiefly to the combined effects of residual
heterozygosity among seed lots and deficiencies in the scoring
of heterozygotes in hybrids. The additional levels of
simulated error, therefore, include values (up to ~35%
total error) that are well outside of our experience. For
five hybrids that were examined, increasing the number
of loci from 50 to 100 (with no additional missing or
misscored data) did reduce the number of instances
where inbreds that were not parents of a hybrid
outranked the true parent from four to one. Nonetheless,
all of these more highly ranked inbreds, although they
were not themselves the true parents of the respective
hybrid, were either direct progeny or full sisters of the
true parent (Table 2). Consequently, if such degrees of
error can be tolerated in respect of pedigrees for inbreds
that are identified as parents of hybrids, then SSR data
from 50 loci of equivalent discrimination ability are
sufficient. Use of data from 50 loci also evidenced
robustness in the face of up to 10% additional levels of
either missing or misscored data; no degradation in the
ability to identify a parent was apparent up to the level of
10% additional error except for 10% additional missing
and misscored alleles for one hybrid (3525; Table 3).
However, use of 100 loci raised the proportion of
true parents that were correctly identified from
[illegible] (for 50 loci) to 71% (mean correct parents
over all levels of error; Table 3). Use of data from 195
loci provided added resilience against additional levels
of error. However, use of data from 195 loci was unable to
provide resiliency against the negative effect of adding
combined levels (at 25%) of both missing and misscored
data (Table 3). At the 25% level of additional poor data
integrity, inbreds that were not related to the true parent
of the hybrid outranked the true parent for four of the
five hybrids. Levels of missing or misscored data should,
therefore, be kept below 15-20% (assuming a level of
5-10% error in the data we analyzed prior to simulating
additional error).

We have previously examined the pedigrees of
inbreds that are ranked into the first two positions when
the true parents are removed from the list of candidate
inbred lines. Usually, direct progeny or full sisters of
the true parents then rank most highly (data not
presented). We therefore examined the rankings of inbreds
with respect to their ranking and probability of inclusion
in the ancestry of each hybrid after the removal, not
only of the true parents, but also of the progeny of the
true parents and any full sisters of the true parents. In
these circumstances grandparents of the hybrids are
ranked predominantly into top positions. Using P =
0.50, a grandparent ranked into first position for 76% of
hybrids and into second position for 57% of hybrids; with
P = 0.99 a grandparent ranked into first place in 56%
of hybrids. At P = 0.50 two grandparents ranked into
first and second positions for 43% of hybrids and into the
first three positions for an additional 9% of hybrids. Most
of the remaining inbreds that ranked into the top two
positions were progeny of the grandparent. A total of
108 grandparents ranked into the top five positions
when P = 0.50; 95 ranked into these positions when P =
0.99. Seventy-one grandparents ranked into the top two
positions when P = 0.50; 55 grandparents ranked into
these positions when P = 0.99. The mean probability
of a grandparent in the top two positions was 0.9283
(SD 0.1431) when P = 0.50 and 0.9980 (SD 0.0104)
when P = 0.99. Our algorithm was written to identify
pairs of ancestors; alternative algorithms could be
tailored to identify all grandparents once parents had been
identified and removed from the list of candidate
inbreds.

We have demonstrated the capability and robustness
of an algorithm that can be used to show probability of
parentage in circumstances where the a priori pedigree
identity of neither parent is known. Exclusions are taken
into account, thereby allowing parentage to be shown
even when the two parents are not represented in the
database of molecular profiles that are examined.
Heterozygous candidate parents can be accommodated.
The number of loci that is necessary to provide a reliable
basis of determining pedigree is dependent upon the
degree of relatedness among parents and nonparents and
upon the discriminatory ability of the marker system
in the species of interest. Using P = 0.99 compared to
P = 0.50 preferentially identified parents, and did so
with a greater difference of probability to the third placed
nonparent. If there is reasonable assurance that the
parents are among the candidate list of inbreds, then
P = 0.99 should be used; if greater robustness is
required, then P = 0.50 should be used.

Applications of our algorithm include the
identification of pedigrees among individuals of plant or animal
species where molecular profile datasets exist that can
be interpreted in terms of segregating alleles at
individual marker loci and that provide a sufficient power of
discrimination. Capabilities to generate large datasets
of suitable molecular profile data are already available
and are increasing rapidly with the advent of single
nucleotide polymorphisms. One further application of
our algorithm is to assist in the protection of intellectual
property that is obtained on plant varieties or upon
specific dams or sires of animals through the
determination of pedigrees.